Abstract

We construct two families of deterministic sensing matrices where the columns are obtained
by exponentiating codewords in the quaternary Delsarte-Goethals code $DG(m,r)$. This method
of construction results in sensing matrices with low coherence and spectral norm. The first family,
which we call Delsarte-Goethals frames, consists of $2^m$-dimensional tight frames with redundancy
$2^{rm}$. The second family, which we call Delsarte-Goethals sieves, is obtained by subsampling the

column vectors in a Delsarte-Goethals frame. Different rows of a Delsarte-Goethals sieve may not be orthogonal, and we present an effective algorithm for identifying all pairs of non-orthogonal rows. The pairs turn out to be duplicate measurements, and eliminating them leads to a tight frame. Experimental results suggest that all $DG(m,r)$ sieves with $m \le 15$ and $r \ge 2$ are tight frames; there are no duplicate rows. For both families of sensing matrices, we measure accuracy of reconstruction (statistical 0-1 loss) and complexity (average reconstruction time) as a function of the sparsity level $k$. Our results show that DG frames and sieves outperform random Gaussian matrices in terms of noiseless and noisy signal recovery using the LASSO.

Index Terms

Compressed Sensing, Reed-Muller Codes, Delsarte-Goethals Set, Random Sub-dictionary, LASSO 

I. Introduction

The central goal of compressed sensing is to capture attributes of a signal using very few measurements. In most work to date, this broader objective is exemplified by the important special case in which the measurement data constitute a vector $f = \Phi\alpha + e$, where $\Phi$ is an $N \times C$ matrix called the sensing matrix, $\alpha$ is a signal in $\mathbb{C}^C$ that is well-approximated by a $k$-sparse vector (a signal with at most $k$ non-zero entries), and $e$ is additive measurement noise.

The role of random measurement in compressive sensing (see [1] and [2]) can be viewed as analogous to the role of random coding in Shannon theory. Both provide worst-case performance guarantees in the context of an adversarial signal/error model. In the standard paradigm, the measurement matrix is required to act as a near isometry on all $k$-sparse signals (this is the Restricted Isometry Property, or RIP, introduced in [3]). It has been shown that if a sensing matrix satisfies the RIP then Basis Pursuit [3], [4] programs can be used to estimate the best $k$-term approximation of any signal in $\mathbb{C}^C$, measured in the presence of any $\ell_2$-norm-bounded measurement noise [5].

It is known that certain probabilistic processes generate sensing matrices that for $k = O(N)$ satisfy $k$-RIP with high probability (see [6]). This is significantly different from the best known results for deterministic sensing matrices [7], where $k$-RIP is known only for $k = O(\sqrt{N})$. We normalize the columns of a sensing matrix to have unit $\ell_2$ norm and define the worst-case coherence $\mu$ to be the maximum absolute value of an inner product of distinct columns. It follows from the Welch bound [8] that $\mu = \Omega(1/\sqrt{N})$. When $\mu = O(1/\sqrt{N})$ it then follows from the Gerschgorin Circle Theorem [9] that the sensing matrix satisfies $k$-RIP with $k = O(\sqrt{N})$. In general, however, no polynomial-time algorithm is known for verifying that

a sensing matrix with the worst-case coherence /i satisfies /c-RIP with k = Q, 

The RIP property is not an end in itself. It provides guarantees for a particular method of signal reconstruction, but there is significant interest in structured sensing matrices and alternative reconstruction algorithms. One example is the adjacency matrices of expander graphs [10], [11], for which it is known to be impossible to satisfy RIP with respect to the $\ell_2$ norm [12]. Sparse signal recovery is still possible with Basis Pursuit since the adjacency matrix acts like a near isometry on $k$-sparse signals with respect to the $\ell_1$ norm. However, error estimates are looser than the corresponding estimates for random sensing matrices, and resilience to measurement noise is limited to sparse noise vectors.

The coherence between rows of a sensing matrix is a measure of the new information provided by an additional measurement. The coherence between columns of a sensing matrix is fundamental to deriving performance guarantees for reconstruction algorithms such as Basis Pursuit. There are two fundamental measures of coherence: the worst-case coherence $\mu$, which measures the maximal coherence between the columns of the sensing matrix, and the spectral norm $\|\Phi\|_2$, which measures the maximal coherence between the rows of the frame. The ideal case is when the worst-case coherence between columns matches the Welch bound ($\mu = O(1/\sqrt{N})$) and different measurements are orthogonal. Then, with high probability, a $k$-sparse vector has a unique sparse representation [13], and this representation can be efficiently recovered using a LASSO program [14]. Section II introduces notation and reviews prior work on the geometry of sensing matrices and the performance of the LASSO reconstruction algorithm.

In this paper we consider sensing matrices based on the $\mathbb{Z}_4$-linear representation of Delsarte-Goethals codes. The columns are obtained by exponentiating codewords in the quaternary Delsarte-Goethals code; they are uniformly and very precisely distributed over the surface of an $N$-dimensional sphere. Coherence between columns reduces to properties of these algebraic codes. Section II-C previews the construction of Delsarte-Goethals (DG) sets of $\mathbb{Z}_4$-linear quadratic forms, which is the starting point for the construction of the corresponding codes; each quadratic form determines a codeword whose entries are the values taken by the quadratic form. Section III introduces Delsarte-Goethals frames and Delsarte-Goethals sieves; the columns of these sensing matrices are obtained by exponentiating DG codewords. We then determine the worst-case coherence and spectral norm for these sensing matrices.

Candes and Plan [14] specified coherence conditions under which a LASSO program will successfully recover a $k$-sparse signal when the $k$ non-zero entries are above the noise variance. We use these results to provide an average-case error analysis for stochastic noise in both the data and measurement domains. The Delsarte-Goethals (DG) sensing matrices are essentially tight frames, so white noise in the data domain maps to white noise in the measurement domain.

Section IV presents the results of numerical experiments that compare DG frames and sieves with random Gaussian matrices of the same size. The SpaRSA package [15] is used to implement the LASSO recovery algorithm in all cases. DG frames and sieves outperform random matrices in terms of probability of successful sparse recovery, but reconstruction time for the DG sieve is greater than that for the other sensing matrices. We remark that there are alternative fast reconstruction algorithms that exploit the structure of DG sensing matrices. The witnessing algorithm proposed in [16] requires less storage, provides support-localized detection, and does not require independence among the support entries. On the other hand, LASSO reconstruction tends to be more robust to noise in the data domain.

II. Background and Notation 
This Section introduces notation and reviews the theory of sparse reconstruction. 






A. Notation 

Given a vector $v = (v_1, \dots, v_n)$ in $\mathbb{R}^n$, $\|v\|_2$ denotes the Euclidean norm of $v$, and $\|v\|_1$ denotes the $\ell_1$ norm of $v$, defined as $\|v\|_1 = \sum_{i=1}^{n} |v_i|$. We further define $\|v\|_\infty = \max\{|v_1|, \dots, |v_n|\}$ and $\|v\|_{\min} = \min\{|v_1|, \dots, |v_n|\}$. The Hamming weight of $v$ is defined as $\|v\|_0 = |\{i : v_i \neq 0\}|$. Whenever clear from the context, we drop the subscript from the $\ell_2$ norm. Also, $v_{i \to j}$ denotes the vector $v$ restricted to entries $i, i+1, \dots, j$, that is, $v_{i \to j} = (v_i, v_{i+1}, \dots, v_j)$.

Let $A$ be a matrix with rank $r$. We denote the conjugate transpose of $A$ by $A^\dagger$. Let $\sigma = [\sigma_1, \dots, \sigma_r]$ denote the vector of the singular values of $A$. The spectral norm $\|A\|$ of a matrix $A$ is the largest singular value of $A$, that is, $\|A\| = \|\sigma\|_\infty$. The condition number of $A$ is the ratio between its largest and its smallest singular values, $\|\sigma\|_\infty / \|\sigma\|_{\min}$. Finally, the nuclear norm of $A$, denoted by $\|A\|_1$, is the $\ell_1$ norm of the singular value vector $\sigma$.

Throughout this paper we shall use the notation $\varphi_j$ for the $j$th column of the sensing matrix $\Phi$; its entries will be denoted by $\varphi_j(x)$, with the row label $x$ varying from $0$ to $N-1$. In other words, $\varphi_j(x)$ is the entry of $\Phi$ in row $x$ and column $j$. We denote the set $\{1, \dots, C\}$ by $[C]$. Let $S$ be a subset of $[C]$; $\Phi_S$ is obtained by restricting $\Phi$ to those columns that are listed in $S$.

A vector $\alpha \in \mathbb{R}^C$ is $k$-sparse if it has at most $k$ non-zero entries. The support of the $k$-sparse vector $\alpha$, denoted by $\mathrm{Supp}(\alpha)$, contains the indices of the non-zero entries of $\alpha$. Let $\pi = \{\pi_1, \dots, \pi_C\}$ be a uniformly random permutation of $[C]$. In this paper our focus is on average-case analysis, and we always assume that $\alpha$ is a $k$-sparse signal with $\mathrm{Supp}(\alpha) = \{\pi_1, \dots, \pi_k\}$. We further assume that, conditioned on the support, the values of the $k$ non-zero entries of $\alpha$ are sampled from a distribution that is absolutely continuous with respect to the Lebesgue measure on $\mathbb{R}^k$.



B. Incoherent Tight Frames 

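An $N \times C$ matrix $\Phi$ with normalized columns is called a dictionary. A dictionary is a tight frame with redundancy $\frac{C}{N}$ if for every vector $v \in \mathbb{R}^N$, $\|\Phi^\dagger v\|^2 = \frac{C}{N}\|v\|^2$. If $\Phi\Phi^\dagger = \frac{C}{N} I_{N \times N}$, then $\Phi$ is a tight frame with redundancy $\frac{C}{N}$ (see [17]).

Proposition 1. Let $\Phi$ be an $N \times C$ dictionary. Then $\|\Phi\|^2 \ge \frac{C}{N}$, and equality holds if and only if $\Phi$ is a tight frame with redundancy $\frac{C}{N}$.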

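Proof: Let $\sigma$ be the singular value vector of $\Phi$. We have

$$\|\Phi\|^2 = \|\sigma\|_\infty^2 \ge \frac{1}{N}\sum_{i=1}^{N} \sigma_i^2 = \frac{1}{N}\mathrm{Tr}\left(\Phi\Phi^\dagger\right) = \frac{C}{N}. \qquad (1)$$

The inequality in Equation (1) changes to equality if and only if all the eigenvalues of $\Phi\Phi^\dagger$ are equal to $\frac{C}{N}$. This is equivalent to the requirement $\Phi\Phi^\dagger = \frac{C}{N} I_{N \times N}$. ■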

The mutual coherence between the columns of an $N \times C$ sensing matrix is defined as

$$\mu = \max_{i \neq j} \left| \varphi_i^\dagger \varphi_j \right|. \qquad (2)$$

Strohmer and Heath [8] showed that the mutual coherence of any $N \times C$ dictionary is at least $\sqrt{\frac{C-N}{N(C-1)}}$. Designing dictionaries with small spectral norms (tight frames in the ideal case) and with small coherence ($\mu = O(1/\sqrt{N})$ in the ideal case) is useful in compressed sensing for the following reasons.

Uniqueness of Sparse Representation ($\ell_0$ minimization). The following results are due to Tropp [13] and show that with overwhelming probability the $\ell_0$ minimization program successfully recovers the original $k$-sparse signal.
Theorem 1. Assume the dictionary $\Phi$ satisfies $\mu \le \frac{c}{\log C}$, where $c$ is an absolute constant. Further assume $k \le \frac{cC}{\|\Phi\|^2 \log C}$. Let $S$ be a random subset of $[C]$ of size $k$, and let $\Phi_S$ be the corresponding $N \times k$ submatrix. Then there exists an absolute constant $c_0$ such that

$$\Pr\left[ \left\| \Phi_S^\dagger \Phi_S - I \right\| \ge c_0 \left( \mu \log C + 2\sqrt{\frac{k\|\Phi\|^2}{C}} \right) \right] \le 2C^{-1}.$$

Theorem 2. Assume the dictionary $\Phi$ satisfies $\mu \le \frac{c}{\log C}$, where $c$ is an absolute constant. Further assume $k \le \frac{cC}{\|\Phi\|^2 \log C}$. Let $\alpha$ be a $k$-sparse vector such that the support of the $k$ nonzero coefficients of $\alpha$ is selected uniformly at random. Then with probability $1 - O(C^{-1})$, $\alpha$ is the unique $k$-sparse vector mapped to $u = \Phi\alpha$ by the measurement matrix $\Phi$.

Sparse Recovery via LASSO ($\ell_1$ minimization). Uniqueness of sparse representation is of limited utility given that $\ell_0$ minimization is computationally intractable. However, given modest restrictions on the class of sparse signals, Candes and Plan [14] have shown that with overwhelming probability the solution to the $\ell_0$ minimization problem coincides with the solution to a convex LASSO program.

Theorem 3. Assume the dictionary $\Phi$ satisfies $\mu \le \frac{c}{\log C}$, where $c$ is an absolute constant. Further assume $k \le \frac{c_1 C}{\|\Phi\|^2 \log C}$, where $c_1$ is a numeric constant. Let $\alpha$ be a $k$-sparse vector such that

1) The support of the $k$ nonzero coefficients of $\alpha$ is selected uniformly at random.

2) Conditional on the support, the signs of the nonzero entries of $\alpha$ are independent and equally likely to be $-1$ or $1$.

Let $u = \Phi\alpha + e$, where $e$ contains $N$ iid $\mathcal{N}(0, \sigma^2)$ Gaussian elements. Then if $\|\alpha\|_{\min} \ge 8\sigma\sqrt{2\log C}$, with probability $1 - O(C^{-1})$ the LASSO estimate

$$\alpha^* = \arg\min_{\alpha^+ \in \mathbb{R}^C} \frac{1}{2}\left\|u - \Phi\alpha^+\right\|^2 + 2\sqrt{2\log C}\,\sigma^2 \left\|\alpha^+\right\|_1$$

has the same support and sign as $\alpha$, and $\|\Phi\alpha - \Phi\alpha^*\|^2 \le c_2 k\sigma^2$, where $c_2$ is a numeric constant.
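The LASSO program of Theorem 3 can be illustrated with a very small solver. The sketch below is a plain real-valued iterative soft-thresholding (ISTA) loop, which is a stand-in technique and not the SpaRSA package used in the experiments. The function names, the step size, and the toy identity dictionary are assumptions made for illustration only, and the penalty weight `tau` is passed in rather than derived from the noise level.

```python
import math

def soft_threshold(v, t):
    """Shrink v toward zero by t (the proximal map of t * |v|)."""
    return math.copysign(max(abs(v) - t, 0.0), v)

def ista_lasso(phi, u, tau, step, iters=200):
    """Approximately minimize 0.5 * ||u - phi a||^2 + tau * ||a||_1.

    phi is a list of N rows with C real entries each. This plain ISTA loop
    is only guaranteed to converge when step <= 1 / ||phi||^2.
    """
    n, c = len(phi), len(phi[0])
    a = [0.0] * c
    for _ in range(iters):
        # Residual u - phi a in the measurement domain.
        r = [u[x] - sum(phi[x][j] * a[j] for j in range(c)) for x in range(n)]
        # Gradient step on the quadratic term, then soft thresholding.
        a = [soft_threshold(a[j] + step * sum(phi[x][j] * r[x] for x in range(n)),
                            step * tau)
             for j in range(c)]
    return a

# Toy example (hypothetical data): with an identity dictionary the minimizer
# is the soft-thresholded measurement vector.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
estimate = ista_lasso(identity, [3.0, -0.5, 1.5], tau=1.0, step=1.0)
```

With an identity dictionary the small entry is thresholded to zero while the large entries keep their signs, which mirrors the support-and-sign recovery statement of the theorem in the simplest possible setting.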

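Stochastic noise in the data domain. The tight-frame property of the sensing matrix makes it possible to map iid Gaussian noise in the data domain to iid Gaussian noise in the measurement domain:

Lemma 1. Let $\epsilon$ be a vector with $C$ iid $\mathcal{N}(0, \sigma_d^2)$ entries and $e$ be a vector with $N$ iid $\mathcal{N}(0, \sigma_m^2)$ entries. Let $h = \Phi\epsilon$ and $\nu = h + e$. Then $\nu$ contains $N$ entries, sampled iid from $\mathcal{N}(0, \sigma^2)$, where $\sigma^2 = \frac{C}{N}\sigma_d^2 + \sigma_m^2$.

Proof: The tight frame property implies

$$E\left[hh^\dagger\right] = E\left[\Phi\epsilon\epsilon^\dagger\Phi^\dagger\right] = \sigma_d^2\,\Phi\Phi^\dagger = \frac{C}{N}\sigma_d^2\, I_{N \times N}.$$

Therefore, $\nu = h + e$ contains iid Gaussian elements with zero mean and variance $\sigma^2$. ■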
Next we construct two families of low-coherence tight frames from Delsarte-Goethals codes. 
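The covariance computation behind Lemma 1 can be checked on a toy example. The sketch below is an illustration under stated assumptions: the function name and the tiny tight frame (the identity basis joined with a normalized 2 x 2 Hadamard basis, so $N = 2$ and $C = 4$) are hypothetical and are not one of the DG matrices. It computes the covariance of $\Phi\epsilon + e$ exactly rather than by simulation.

```python
import math

def measurement_noise_covariance(phi, sigma_d, sigma_m):
    """Covariance of phi @ eps + e for independent white noise vectors:
    eps has C entries of variance sigma_d^2, e has N entries of variance sigma_m^2."""
    n, c = len(phi), len(phi[0])
    cov = []
    for x in range(n):
        row = []
        for y in range(n):
            # Entry (x, y) of phi phi^dagger.
            gram = sum(phi[x][j] * complex(phi[y][j]).conjugate() for j in range(c))
            row.append(sigma_d ** 2 * gram + (sigma_m ** 2 if x == y else 0.0))
        cov.append(row)
    return cov

s = 1 / math.sqrt(2)
toy_frame = [[1.0, 0.0, s, s],    # hypothetical tight frame, redundancy C/N = 2
             [0.0, 1.0, s, -s]]
cov = measurement_noise_covariance(toy_frame, sigma_d=0.5, sigma_m=0.2)
expected_variance = (4 / 2) * 0.5 ** 2 + 0.2 ** 2  # (C/N) sigma_d^2 + sigma_m^2
```

The resulting covariance is a multiple of the identity with the variance stated in the lemma; since a linear image of a Gaussian vector is Gaussian, uncorrelated entries are also independent.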



C. Delsarte-Goethals Sets of Binary Symmetric Matrices 

The finite field $\mathbb{F}_{2^m}$ is obtained from the binary field $\mathbb{F}_2$ by adjoining a root $\xi$ of a primitive irreducible polynomial $g$ of degree $m$. The elements of $\mathbb{F}_{2^m}$ are polynomials in $\xi$ of degree at most $m-1$ with coefficients in $\mathbb{F}_2$, and we will identify the polynomial $x_0 + x_1\xi + \cdots + x_{m-1}\xi^{m-1}$ with the binary $m$-tuple $(x_0, \dots, x_{m-1})$. The Frobenius map $f : \mathbb{F}_{2^m} \to \mathbb{F}_{2^m}$ is defined by $f(x) = x^2$, and the Trace map $\mathrm{Tr} : \mathbb{F}_{2^m} \to \mathbb{F}_2$ is defined by

$$\mathrm{Tr}(x) = x + x^2 + \cdots + x^{2^{m-1}}.$$

The identity $(x+y)^2 = x^2 + y^2$ implies that $\mathrm{Tr}(x+y) = \mathrm{Tr}(x) + \mathrm{Tr}(y)$; the trace is a linear map over the binary field $\mathbb{F}_2$. The trace inner product given by $(v, w) = \mathrm{Tr}(vw)$ is non-degenerate; if $\mathrm{Tr}(vz) = 0$ for all $z$ in $\mathbb{F}_{2^m}$ then $v = 0$. Every element $a$ in $\mathbb{F}_{2^m}$ determines a symmetric bilinear form $\mathrm{Tr}[xya]$, to which is associated a binary symmetric matrix $P^0(a)$:

$$\mathrm{Tr}[xya] = (x_0 \cdots x_{m-1})\, P^0(a)\, (y_0 \cdots y_{m-1})^\top.$$

The Kerdock set $K_m$ is the $m$-dimensional binary vector space formed by the matrices $P^0(a)$. For example, let $m = 3$, and assume the finite field $\mathbb{F}_8$ is generated by adjoining a root $\xi$ of the polynomial $g(x) = x^3 + x + 1$. Then $K_3$ is spanned by

$$P^0(100) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \quad P^0(010) = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}, \quad \text{and} \quad P^0(001) = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}.$$

Theorem 4. Every nonzero matrix in $K_m$ is nonsingular.

Proof: If $xP^0(a) = 0$ for some nonzero $x$, then $\mathrm{Tr}[xya] = 0$ for all $y \in \mathbb{F}_{2^m}$. Now the non-degeneracy of the trace implies $xa = 0$, and hence $a = 0$. ■
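The Kerdock matrices for $m = 3$ can be generated directly from the trace form. The following sketch is illustrative only: the helper names and the integer encoding of field elements (bit $i$ holds the coefficient of $\xi^i$) are assumptions, not notation from the text. It builds $P^0(a)$ entry by entry as $\mathrm{Tr}(\xi^i \xi^j a)$ and checks nonsingularity by Gaussian elimination over $\mathbb{F}_2$.

```python
M = 3          # field degree
POLY = 0b1011  # g(x) = x^3 + x + 1, with the x^3 bit included

def gf_mul(a, b):
    """Multiply two elements of GF(2^M); bit i of an element is the coefficient of xi^i."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> M) & 1:
            a ^= POLY  # reduce modulo g
    return result

def trace(x):
    """Tr(x) = x + x^2 + ... + x^(2^(M-1)), returned as the integer 0 or 1."""
    total, power = 0, x
    for _ in range(M):
        total ^= power
        power = gf_mul(power, power)
    return total

def kerdock_matrix(a):
    """Matrix P^0(a) with entries Tr(xi^i * xi^j * a)."""
    xi = [1]
    for _ in range(M - 1):
        xi.append(gf_mul(xi[-1], 2))  # the integer 2 encodes xi itself
    return [[trace(gf_mul(gf_mul(xi[i], xi[j]), a)) for j in range(M)]
            for i in range(M)]

def rank_gf2(matrix):
    """Rank over GF(2) by Gaussian elimination on rows packed into integers."""
    rows = [int("".join(map(str, r)), 2) for r in matrix]
    rank = 0
    for bit in reversed(range(M)):
        pivot = next((r for r in rows if (r >> bit) & 1), None)
        if pivot is None:
            continue
        rows.remove(pivot)
        rows = [r ^ pivot if (r >> bit) & 1 else r for r in rows]
        rank += 1
    return rank

for a in (0b001, 0b010, 0b100):  # the tuples (100), (010), (001)
    print(a, kerdock_matrix(a))
```

Looping `rank_gf2(kerdock_matrix(a))` over all seven nonzero `a` gives a direct check of Theorem 4 for this field.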

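Next we define higher-order bilinear forms, each associated with a binary symmetric matrix. Given a positive integer $t$ with $0 < t \le \frac{m-1}{2}$ and a field element $a$,

$$\mathrm{Tr}\left[\left(x y^{2^t} + x^{2^t} y\right) a\right]$$

defines a symmetric bilinear form that is represented by a binary symmetric matrix $P^t(a)$ as above:

$$\mathrm{Tr}\left[\left(x y^{2^t} + x^{2^t} y\right) a\right] = (x_0 \cdots x_{m-1})\, P^t(a)\, (y_0 \cdots y_{m-1})^\top.$$

The Delsarte-Goethals set $DG(m,r)$ is then defined as

$$DG(m,r) = \left\{ \sum_{t=0}^{r} P^t(a_t) \;\middle|\; a_t \in \mathbb{F}_{2^m},\ t = 0, 1, \dots, r \right\}. \qquad (3)$$

The Delsarte-Goethals sets are nested,

$$K_m = DG(m,0) \subset DG(m,1) \subset \cdots \subset DG\left(m, \tfrac{m-1}{2}\right),$$

and every bilinear form is associated with some matrix in $DG\left(m, \frac{m-1}{2}\right)$.
For example, let $m = 3$ and $g(x) = x^3 + x + 1$; the set $DG(3,1)$ is spanned by $K_3$ and

$$P^1(100) = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \quad P^1(010) = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad \text{and} \quad P^1(001) = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}.$$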

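Theorem 5. Every nonzero matrix in $DG(m,r)$ has rank at least $m - 2r$.
Proof: If $x$ is in the null space of $\sum_{t=0}^{r} P^t(a_t)$ then for all $y \in \mathbb{F}_{2^m}$,

$$\mathrm{Tr}\left[x y a_0 + \sum_{t=1}^{r}\left(x y^{2^t} + x^{2^t} y\right) a_t\right] = 0.$$

Since $\mathrm{Tr}(x) = \mathrm{Tr}(x^2) = \cdots = \mathrm{Tr}\left(x^{2^{m-1}}\right)$, we have

$$\mathrm{Tr}\left[y^{2^r}\left((x a_0)^{2^r} + \sum_{t=1}^{r}\left((x a_t)^{2^{r-t}} + a_t^{2^r} x^{2^{r+t}}\right)\right)\right] = 0.$$

Non-degeneracy of the trace now implies

$$(x a_0)^{2^r} + \sum_{t=1}^{r}\left((x a_t)^{2^{r-t}} + a_t^{2^r} x^{2^{r+t}}\right) = 0.$$

This is a polynomial in $x$ of degree at most $2^{2r}$, so there are at most $2^{2r}$ solutions. Hence the rank of the binary symmetric matrix $\sum_{t=0}^{r} P^t(a_t)$ is at least $m - 2r$. ■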
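The rank bound of Theorem 5 can be checked by brute force for small parameters. The sketch below is a hedged illustration: the helper names, the integer encoding of field elements (bit $i$ is the coefficient of $\xi^i$), and the choice of $x^5 + x^2 + 1$ as the degree-5 polynomial are assumptions, not taken from the text. It builds $\sum_t P^t(a_t)$ from the trace forms and reports the smallest rank over all nonzero matrices in $DG(m,r)$.

```python
import itertools

def make_field(m, poly):
    """Return (mul, frobenius, trace) for GF(2^m); poly includes the x^m bit."""
    def mul(a, b):
        result = 0
        while b:
            if b & 1:
                result ^= a
            b >>= 1
            a <<= 1
            if (a >> m) & 1:
                a ^= poly
        return result

    def frobenius(a, t):
        # a^(2^t), computed by t successive squarings.
        for _ in range(t):
            a = mul(a, a)
        return a

    def trace(x):
        total = 0
        for _ in range(m):
            total ^= x
            x = mul(x, x)
        return total

    return mul, frobenius, trace

def dg_matrix(m, poly, coeffs):
    """Binary symmetric matrix sum_t P^t(a_t) for coeffs = (a_0, ..., a_r)."""
    mul, frobenius, trace = make_field(m, poly)
    xi = [1]
    for _ in range(m - 1):
        xi.append(mul(xi[-1], 2))
    P = [[0] * m for _ in range(m)]
    for t, a in enumerate(coeffs):
        for i in range(m):
            for j in range(m):
                if t == 0:
                    form = mul(xi[i], xi[j])
                else:
                    form = mul(xi[i], frobenius(xi[j], t)) ^ mul(frobenius(xi[i], t), xi[j])
                P[i][j] ^= trace(mul(form, a))
    return P

def rank_gf2(matrix):
    """Rank over GF(2) by Gaussian elimination on rows packed into integers."""
    m = len(matrix)
    rows = [int("".join(map(str, r)), 2) for r in matrix]
    rank = 0
    for bit in reversed(range(m)):
        pivot = next((r for r in rows if (r >> bit) & 1), None)
        if pivot is None:
            continue
        rows.remove(pivot)
        rows = [r ^ pivot if (r >> bit) & 1 else r for r in rows]
        rank += 1
    return rank

def min_nonzero_rank(m, poly, r):
    """Smallest rank among the nonzero matrices of DG(m, r)."""
    best = m
    for coeffs in itertools.product(range(2 ** m), repeat=r + 1):
        P = dg_matrix(m, poly, coeffs)
        if any(any(row) for row in P):
            best = min(best, rank_gf2(P))
    return best
```

For example, `min_nonzero_rank(5, 0b100101, 1)` enumerates the matrices of $DG(5,1)$, and `dg_matrix(3, 0b1011, (0, a))` returns $P^1(a)$ for the $m = 3$ field used in the examples.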



III. Delsarte-Goethals Sensing 

A. Delsarte-Goethals Frames 

We start by picking an odd number $m$. The $2^m$ rows of the sensing matrix $\Phi$ are indexed by the binary $m$-tuples $x$, and the $2^{(r+2)m}$ columns are indexed by the pairs $P, b$, where $P$ is an $m \times m$ binary symmetric matrix in the Delsarte-Goethals set $DG(m,r)$ and $b$ is a binary $m$-tuple. The entry $\varphi_{P,b}(x)$ is given by

$$\varphi_{P,b}(x) = \frac{1}{\sqrt{N}}\, i^{\,xPx^\top + 2bx^\top}, \qquad (4)$$

where $N = 2^m$. Note that all arithmetic in the expression $xPx^\top + 2bx^\top$ takes place in the ring of integers modulo 4. Given $P, b$, the vector $xPx^\top + 2bx^\top$ is a codeword in the Delsarte-Goethals code (defined over the ring of integers modulo 4). For a fixed matrix $P$, the $2^m$ columns $\varphi_{P,b}$, $b \in \mathbb{F}_2^m$, form an orthonormal basis. The name Delsarte-Goethals frame (DG frame) reflects the fact that $\Phi$ is a union of orthonormal bases. Hence, it is a tight frame with redundancy $\frac{C}{N}$. Delsarte-Goethals frames are highly incoherent (see [17]):

Proposition 2. Let $m$ and $r$ be non-negative integers where $m$ is odd and $r \le \frac{m-1}{2}$. Then the worst-case coherence $\mu$ of the sensing matrix derived from the $DG(m,r)$ set satisfies $\mu \le \frac{2^r}{\sqrt{N}}$.

Sensing matrices derived from Delsarte-Goethals sets are incoherent tight frames, so the results of Section II-B can be brought to bear. The $N \times N^2$ sensing matrix derived from the Kerdock set is the union of mutually unbiased bases, and the worst-case coherence matches the lower bound derived by Levenshtein [18] (see also Strohmer and Heath [8]).
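The frame construction can be made concrete for the smallest case. The sketch below is illustrative and rests on stated assumptions: it uses the Kerdock set for $m = 3$ with $g(x) = x^3 + x + 1$ (that is, $r = 0$), normalizes entries by $1/\sqrt{N}$ so that columns have unit norm, and uses helper names of its own. It exponentiates the $\mathbb{Z}_4$-valued codewords, then computes the worst-case coherence and the Gram matrix of the rows.

```python
import itertools
import math

M, POLY = 3, 0b1011                  # GF(8) with g(x) = x^3 + x + 1
N = 2 ** M
UNITS = (1 + 0j, 1j, -1 + 0j, -1j)   # powers of i, indexed by an exponent mod 4

def gf_mul(a, b):
    """Multiply in GF(2^M); bit i of an element is the coefficient of xi^i."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> M) & 1:
            a ^= POLY
    return result

def trace(x):
    total = 0
    for _ in range(M):
        total ^= x
        x = gf_mul(x, x)
    return total

def kerdock_set():
    """All 2^M matrices P^0(a), including the zero matrix for a = 0."""
    xi = [1]
    for _ in range(M - 1):
        xi.append(gf_mul(xi[-1], 2))
    return [[[trace(gf_mul(gf_mul(xi[i], xi[j]), a)) for j in range(M)]
             for i in range(M)] for a in range(N)]

def dg_column(P, b):
    """Column with entries i^(x P x^T + 2 b x^T) / sqrt(N), arithmetic mod 4."""
    column = []
    for x in itertools.product((0, 1), repeat=M):
        quad = sum(x[i] * P[i][j] * x[j] for i in range(M) for j in range(M))
        lin = sum(bi * xi for bi, xi in zip(b, x))
        column.append(UNITS[(quad + 2 * lin) % 4] / math.sqrt(N))
    return column

frame = [dg_column(P, b) for P in kerdock_set()
         for b in itertools.product((0, 1), repeat=M)]
C = len(frame)

# Worst-case coherence: largest inner product between distinct columns.
coherence = max(abs(sum(u.conjugate() * v for u, v in zip(frame[i], frame[j])))
                for i in range(C) for j in range(i + 1, C))
# Gram matrix of the rows; a tight frame gives (C/N) times the identity.
row_gram = [[sum(col[x] * col[y].conjugate() for col in frame) for y in range(N)]
            for x in range(N)]
```

In this small case the coherence can be compared with $2^r/\sqrt{N}$ for $r = 0$ and the row Gram matrix with $\frac{C}{N} I$; larger $m$ and $r$ follow the same pattern but the brute-force coherence loop grows quadratically in the number of columns.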



B. Delsarte-Goethals Sieves 

Chirp Detection [17] and Witness Averaging [19] are fast reconstruction algorithms that exploit the structure of Delsarte-Goethals frames. By sieving the testimony of witnesses [19] it is possible to detect the presence or absence of a signal at any given position in the data domain without explicitly reconstructing the entire signal.

There is, however, an aliasing problem with DG frames. When two signals modulate columns in the same orthonormal basis, spurious tones are generated by both the chirp detection and witness interrogation algorithms. This can be resolved by decimating the DG frame so that no two columns share the same binary symmetric matrix $P$. The simplest way to do this is to retain the columns

$$\varphi_P(x) = \frac{1}{\sqrt{N}}\, i^{\,xPx^\top} \qquad (5)$$

for which $b = 0$. We call these subsampled matrices Delsarte-Goethals sieves ($DG(m,r)$ sieves) since it is still possible to sieve the testimony of witnesses. Note that each column of a DG sieve is a column of the corresponding DG frame, so the worst-case coherence bound follows from Proposition 2. Figure 1 shows the distribution of the absolute value of pairwise inner products between columns of the $DG(5,1)$ sieve. All entries on the main diagonal are equal to 1, and around the diagonal there are squares corresponding to translates of the Kerdock set $K_5$.

Table I shows that subsampling may increase the spectral norm. This will make it more difficult to reconstruct the signal either by chirp detection or by sieving the testimony of witnesses. We need to understand this increase in order to be able to apply the results of Section II.

TABLE I: Spectral norms of $DG(m,1)$ frames and $DG(m,1)$ sieves as a function of $m$

DG(m,1)    m = 3     m = 5     m = 7     m = 9
Frame      2.8284    5.6569    11.3137   22.6274
Sieve      5.6568    11.1295   25.0386   55.0338




Fig. 1: The inner product between the columns of a $DG(5,1)$ matrix: (a) the first 512 columns; (b) the first 256 columns. The point at position $(i,j)$ shows the inner product between the columns $\varphi_i$ and $\varphi_j$. Lighter color shows higher inner product value.



C. Spectral Norm of DG Matrices 

Given a sensing matrix, the results presented in Section II show that if the worst-case coherence and spectral norm are sufficiently small then $\ell_0$ minimization has a unique solution, which coincides with the solution of a convex LASSO program. The worst-case coherence $\mu$ of the initial $DG(m,r)$ frame satisfies $\mu \le 2^r N^{-1/2}$. To make sure that every row sum vanishes, we further exclude the $m+1$ rows, indexed by powers of 2, from the DG sieve. This exclusion changes the worst-case coherence by at most $\frac{m+1}{N}$. Now $\mu \le 2^r N^{-1/2} + \frac{m+1}{N}$. The experimental results presented below suggest that the number of pairs of rows in a DG sieve that fail to be orthogonal is very small. Removing these rows results in an equiangular tight frame that is not a union of orthonormal bases.

Table I lists the spectral norm of $DG(m,r)$ frames and $DG(m,r)$ sieves for $m = 3, 5, 7$ and $9$. The spectral norm of a sieve is almost twice that of the corresponding frame, and we shall see that the reason is a small number of duplicate rows. Removing these rows results in an equiangular tight frame. We now describe how to find these duplicate rows.

Let $x, y$ be two distinct elements of the finite field $\mathbb{F}_{2^m}$, and let $\varphi(x)$, $\varphi(y)$ denote the two rows in $\Phi$ indexed by $x$ and $y$. Setting $y = x + e$ we obtain

$$\varphi(x)^\dagger \varphi(y) = \frac{1}{N}\sum_{P \in DG(m,r)} i^{\,yPy^\top - xPx^\top} = \frac{1}{N}\sum_{P \in DG(m,r)} i^{\,2ePx^\top + ePe^\top} = \frac{1}{N}\prod_{t=0}^{r}\ \sum_{a \in \mathbb{F}_{2^m}} i^{\,2eP^t(a)x^\top + eP^t(a)e^\top}. \qquad (6)$$

If rows $\varphi(x)$ and $\varphi(y)$ are not orthogonal then each term in the product is nonzero. When $t > 0$ we now show that the $t$th term in the product is a sum of linear characters. Since the index of summation ranges over the group, the sum is either zero or the linear character is trivial (each term in the sum is equal to 1).

Lemma 2. Let $t \ge 1$ and let $x$ and $x + e$ be two distinct elements of $\mathbb{F}_{2^m}$. Then either $\sum_{a \in \mathbb{F}_{2^m}} i^{\,(x+e)P^t(a)(x+e)^\top - xP^t(a)x^\top}$ is zero, or for every field element $a$: $(x+e)P^t(a)(x+e)^\top - xP^t(a)x^\top = 0 \pmod 4$.

Proof: When $t > 0$ every matrix $P^t(a)$ has zero diagonal, and the map $a \mapsto (e + 2x)P^t(a)e^\top$ is a linear map from the additive group $\mathbb{F}_{2^m}$ to $2\mathbb{Z}_4$. If this map is not identically zero then the character sum vanishes. ■
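The vanishing of a nontrivial character sum, which drives the proof of Lemma 2, is easy to verify numerically. In the sketch below, a linear map from the additive group into $2\mathbb{Z}_4$ is modeled as $a \mapsto 2(c \cdot a)$ for a fixed binary vector $c$, so that $i^{2(c \cdot a)} = (-1)^{c \cdot a}$. The vector `c`, the function name, and the packing of binary vectors into integers are assumptions made for this illustration.

```python
def character_sum(c, m):
    """Sum over a in F_2^m of (-1)^(c . a), with binary vectors packed into integers."""
    return sum((-1) ** bin(c & a).count("1") for a in range(2 ** m))

# One sum per choice of c in F_2^4; only c = 0 gives the trivial character.
sums = [character_sum(c, 4) for c in range(16)]
```

Only the trivial character ($c = 0$) gives a nonzero sum, equal to the group order; every other choice of `c` sums to zero, which is exactly the dichotomy stated in the lemma.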

The next proposition follows from non-degeneracy of the trace.
Proposition 3. If $t > 0$ then for every field element $f$,

$$f P^t(a) f^\top = 2\,\mathrm{Tr}\left(f^{2^t+1} a\right) + 2 z_a f^\top \pmod 4, \quad \text{where } z_a = \left(\mathrm{Tr}\left(\xi^{j(2^t+1)} a\right)\right)_{j = 0, \dots, m-1}. \qquad (7)$$

Proof: Since the quadratic forms $fP^t(a)f^\top$ and $2\,\mathrm{Tr}\left(a f^{2^t+1}\right)$ determine the same bilinear form, they differ by a linear function $2z_a f^\top$. Since the quadratic form $fP^t(a)f^\top$ vanishes at all standard coordinate vectors, we are able to determine the entries of the vector $2z_a$ that describes the linear function. ■

Next we use non-degeneracy of the trace to find duplicate rows $\varphi(x)$ and $\varphi(x + e)$.
Lemma 3. The existence of field elements $x, e$ such that

$$(x+e)P^t(a)(x+e)^\top - xP^t(a)x^\top = 0 \pmod 4 \quad \text{for all } a \text{ in } \mathbb{F}_{2^m} \qquad (8)$$

is equivalent to the existence of a solution $\frac{x}{e}$ to the equation

$$\frac{x}{e} + \left(\frac{x}{e}\right)^{2^t} = 1 + \sum_{j=0}^{m-1} e_j \left(\frac{\xi^j}{e}\right)^{2^t+1}. \qquad (9)$$

Proof: Since the trace is a linear map we may replace (8) by the condition that for all $a$ in $\mathbb{F}_{2^m}$

$$\mathrm{Tr}\left[a\left((x+e)^{2^t+1} + x^{2^t+1} + \sum_{j=0}^{m-1} e_j \xi^{(2^t+1)j}\right)\right] = 0.$$

Now the non-degeneracy of the trace implies that $(x+e)^{2^t+1} + x^{2^t+1} + \sum_{j=0}^{m-1} e_j \xi^{(2^t+1)j} = 0$. Expanding $(x+e)^{2^t+1}$, we obtain

$$e^{2^t+1} + x e^{2^t} + x^{2^t} e + \sum_{j=0}^{m-1} e_j \xi^{(2^t+1)j} = 0.$$

Since $e$ is non-zero, dividing the equation by $e^{2^t+1}$ completes the proof. ■

The solutions to the equation $z + z^{2^t} = 0$ form a subfield of $\mathbb{F}_{2^m}$, and the number of nonzero solutions is $\gcd(2^t - 1, 2^m - 1)$. Note that when $m$ is odd and $t = 1$ or $t = 2$, there are exactly two solutions ($z = 0$ and $z = 1$). We now list the conditions satisfied by $x$ and $e$ if the row $\varphi(x)$ is not orthogonal to the row $\varphi(x + e)$.
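The count of solutions to $z + z^{2^t} = 0$ can be confirmed by exhaustive search in small fields. The sketch below is an illustration under assumptions: the helper names are hypothetical, field elements are encoded as integers (bit $i$ is the coefficient of $\xi^i$), and $x^5 + x^2 + 1$ and $x^6 + x + 1$ are the polynomials chosen here to define $\mathbb{F}_{32}$ and $\mathbb{F}_{64}$. The nonzero solutions are the roots of $z^{2^t - 1} = 1$, so their number is $\gcd(2^t - 1, 2^m - 1)$, and $z = 0$ is always one further solution.

```python
import math

def gf_mul(a, b, m, poly):
    """Multiply in GF(2^m); poly includes the x^m bit."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= poly
    return result

def frobenius(a, t, m, poly):
    """a^(2^t), computed by t successive squarings."""
    for _ in range(t):
        a = gf_mul(a, a, m, poly)
    return a

def count_solutions(m, poly, t):
    """Number of z in GF(2^m) with z + z^(2^t) = 0, i.e. z^(2^t) = z."""
    return sum(1 for z in range(2 ** m) if frobenius(z, t, m, poly) == z)

def predicted_count(m, t):
    """Nonzero solutions counted by the gcd, plus the solution z = 0."""
    return math.gcd(2 ** t - 1, 2 ** m - 1) + 1
```

For odd $m$ and $t \in \{1, 2\}$ the search finds only $z = 0$ and $z = 1$, while an even degree such as $m = 6$ with $t = 2$ picks up the four elements of the subfield $\mathbb{F}_4$.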

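Theorem 6. Let $x$ and $x + e$ be two distinct elements of the finite field $\mathbb{F}_{2^m}$. Then $\varphi(x)^\dagger \varphi(x + e) \neq 0$ if and only if the following conditions simultaneously hold:

(C1) For every $t \ge 1$: $\frac{x}{e} + \left(\frac{x}{e}\right)^{2^t} = 1 + \sum_{j=0}^{m-1} e_j \left(\frac{\xi^j}{e}\right)^{2^t+1}$.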



Theorem 6 provides an efficient way to identify the non-orthogonal rows of the sieve matrices without explicitly calculating the Gram matrix $\Phi\Phi^\dagger$ of the rows. For every element $e$, we first find the solution for the case $t = 1$. If such a solution exists, we then check that condition (C1) holds for the other values of $t$. If all of these checks pass, we finally verify condition (C2). This method significantly reduces the computational cost of eliminating the non-orthogonal rows.

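The next formula is for $t = 1$:

$$\frac{x}{e} + \left(\frac{x}{e}\right)^{2} = \lambda, \quad \text{where } \lambda = 1 + \sum_{j=0}^{m-1} e_j \left(\frac{\xi^j}{e}\right)^{3}.$$

This is a quadratic equation with roots $\frac{x}{e}$ and $\frac{x}{e} + 1$, where $\frac{x}{e} = \sum_{\ell\ \mathrm{odd},\ 1 \le \ell \le m-2} \lambda^{2^\ell}$. On the other hand, for $t = 2$,

$$\lambda + \lambda^{2} = \frac{x}{e} + \left(\frac{x}{e}\right)^{4} = \sigma, \quad \text{where } \sigma = 1 + \sum_{j=0}^{m-1} e_j \left(\frac{\xi^j}{e}\right)^{5}.$$

Thus we can also retrieve the explicit solution $\lambda = \sum_{\ell\ \mathrm{odd},\ 1 \le \ell \le m-2} \sigma^{2^\ell}$. In other words, the following equivalence between the two field elements (which are both functions of $e$) must be satisfied:

$$\sum_{\ell\ \mathrm{odd},\ 1 \le \ell \le m-2} \left(1 + \sum_{j=0}^{m-1} e_j \left(\frac{\xi^j}{e}\right)^{5}\right)^{2^\ell} = 1 + \sum_{j=0}^{m-1} e_j \left(\frac{\xi^j}{e}\right)^{3}. \qquad (10)$$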
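The explicit root of the quadratic $z^2 + z = \lambda$ used above can be checked exhaustively for a small odd $m$. The sketch below is a hedged illustration: the helper names are hypothetical, and $\mathbb{F}_{32}$ is represented with the polynomial $x^5 + x^2 + 1$, which is a choice made here. The formula $z = \sum_{\ell\ \mathrm{odd}} \lambda^{2^\ell}$ satisfies $z + z^2 = \mathrm{Tr}(\lambda) + \lambda$, so it is a root exactly when $\mathrm{Tr}(\lambda) = 0$.

```python
M, POLY = 5, 0b100101   # GF(32) with x^5 + x^2 + 1 (odd degree)

def gf_mul(a, b):
    """Multiply in GF(2^M); bit i of an element is the coefficient of xi^i."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> M) & 1:
            a ^= POLY
    return result

def frobenius(a, t):
    """a^(2^t), computed by t successive squarings."""
    for _ in range(t):
        a = gf_mul(a, a)
    return a

def trace(x):
    total = 0
    for _ in range(M):
        total ^= x
        x = gf_mul(x, x)
    return total

def solve_quadratic(lam):
    """Return z = sum of lam^(2^l) over odd l with 1 <= l <= M - 2."""
    z = 0
    for l in range(1, M - 1, 2):
        z ^= frobenius(lam, l)
    return z

# Elements lam with Tr(lam) = 0 are exactly those for which z^2 + z = lam is solvable.
solvable = [lam for lam in range(2 ** M) if trace(lam) == 0]
roots = {lam: solve_quadratic(lam) for lam in solvable}
```

The second root is obtained by adding 1 (XOR with the integer 1 in this encoding), matching the pair of roots $\frac{x}{e}$ and $\frac{x}{e} + 1$ in the text.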

Remark 1. Solutions to condition (C1) correspond to codewords of weight 2 in the binary code that is dual to the code determined by matrices in $DG(m,r)$ with zero diagonal. The number of solutions can be calculated using the MacWilliams Identities, and we provide details in the Appendix.



Table II records the number of duplicate measurements that need to be deleted in order to transform a $DG(m,1)$ sieve into a tight frame. We calculated the number of duplicate rows for $DG(m,2)$, where $m \le 15$, and found that there were no solutions to (10) that also satisfied (C2); that is, all $DG(m,2)$ sieves with $m \le 15$ are tight frames. Hence

Conjecture: Every $DG(m,r)$ sieve with $r \ge 2$ is a tight frame.

Figure 2 displays, for $m = 7$ and $9$, the average condition number of a random $N \times k$ submatrix of the $DG(m,1)$ sieve and the $DG(m,0)$ frame. The spectral norm of the hollow Gram matrix, $\|\Phi_k^\dagger\Phi_k - I_k\|_2$, was calculated for 2000 randomly chosen submatrices and the average was recorded. The comparison with Gaussian sensing matrices was made by drawing 10 iid Gaussian matrices, calculating for each matrix the average spectral norm over randomly chosen submatrices, and then recording the median value.

TABLE II: Number of row deletions required to transform a $DG(m,1)$ sieve into a tight frame.

DG(m,1)                            m = 5    m = 7    m = 9    m = 11   m = 13   m = 15
Number of non-orthogonal rows      11       25       45       83       203      381
Fraction of non-orthogonal rows    0.3438   0.1953   0.0879   0.0405   0.0248   0.0116


Fig. 2: Average spectral norm of $\Phi_k^\dagger\Phi_k - I_{k \times k}$, where $\Phi_k$ is a random sub-dictionary of $\Phi$. Here the comparison is between Gaussian, $DG(m,1)$ sieve, and $DG(m,0)$ base matrices. Each experiment is repeated 2000 times.



Remark 2. Here we compare the empirical results of Figure 2 with the theoretical results of Theorem 2. First we considered the $DG(7,0)$ frame, with $C = 2^{14}$ and $N = 2^7$. The worst-case coherence of $\Phi$ is $\mu = 2^{-7/2}$, and the square of the spectral norm of $\Phi$ is $2^7$. So the constant $c$ in Theorem 3 needs to be at least $\mu \log C = 2^{-7/2} \log 2^{14} \approx 0.85$. Hence, as long as $k$ is at most $\frac{cC}{\|\Phi\|^2 \log C} \approx 11$, Theorem 2 predicts probability of non-uniqueness on the order of $2^{-14}$. Experimental results presented in Figure 2a are more positive; all 2000 trials resulted in sub-dictionaries with full rank, even for $k$ as large as 20.

Next we considered the $DG(7,1)$ sieve with $C = 2^{14}$ and $N = 103$ (the 25 duplicate rows were removed from the matrix). The worst-case coherence of $\Phi$ is $\mu \approx 2^{-5/2}$, and the square of the spectral norm of $\Phi$ is $\|\Phi\|^2 \approx 159.6$. As a result, the constant $c$ needs to be at least approximately 1.70. Therefore, as long as $k$ is less than approximately 10, Theorem 2 predicts probability of non-uniqueness on the order of $2^{-14}$. Again, we see that the theoretical bound is not tight, and for $k$ as large as 20 all trials provide uniqueness of sparse representation.
Fig. 3: Average nuclear norm $\sum_{i=1}^{k} \sigma_i$ of random sub-dictionaries of $DG(7,1)$ and Gaussian matrices of the same size as a function of the sparsity level $k$.



Remark 3. The bounds of Proposition 1 only apply to the condition number of random submatrices and do not provide additional information about the distribution of eigenvalues. However, Gurevich and Hadani [20] have analyzed the spectrum of certain incoherent dictionaries that are unions of disjoint orthonormal bases. They have shown that the eigenvalues of the Gram matrix of a random subdictionary are asymptotically distributed around 1 according to the Wigner semicircle law. Our experimental results suggest that this property is shared by DG sieves, which are not unions of orthonormal bases. Figure 3 shows that the distribution of the singular values of a random submatrix of a DG sieve is symmetric around 1, and very similar to the distribution for a Gaussian matrix of the same size.



IV. Numerical Experiments 

In this Section we present numerical experiments to evaluate the performance of the DG frames and 
sieves. The performance of DG frames and sieves is compared with that of random Gaussian sensing 
matrices of the same size. The SpaRSA algorithm fT5\ with £1 regularization parameter A = 10~^ is 
used for signal reconstruction in the noiseless case, and the parameter is adjusted according to Theorem [3] 
in the noisy case. The reason for using SpaRSA is that it is designed to solve complex-valued LASSO
programs.

Remark 4. Given a random sensing matrix satisfying RIP, it is known that Basis Pursuit leads to
more accurate reconstruction than the LASSO [1]. It is for this reason that we also compare results
for LASSO applied to DG matrices with results for Basis Pursuit applied to Gaussian matrices. The
$\ell_1$-magic package [21] is used to solve the Basis Pursuit optimization program. The results for Gaussian
matrices shown in Figure 4 are consistent with the observation made in [22] that when the signal is
(a) Average fraction of the support that is reconstructed successfully as a function of the sparsity level $k$. (b) Average reconstruction time in the noiseless regime for different sensing matrices.

Fig. 4: Comparison between the $DG(7,0)$ frame, the $DG(7,1)$ sieve, and Gaussian matrices of the same size
in the noiseless regime. The regularization parameter for LASSO is set to 10"^. 
(a) The impact of noise in the measurement domain on the accuracy of the sparse approximation for different sensing matrices. (b) The impact of noise in the data domain on the accuracy of the sparse approximation for different sensing matrices.

Fig. 5: Average fraction of the support that is reconstructed successfully as a function of the noise level in the measurement domain (left) and in the data domain (right). Here the sparsity level is 14. The regularization parameter for LASSO is determined as a function of the noise variance according to Theorem 3.



not very sparse, interior point methods ($\ell_1$-magic) are less sensitive than gradient descent methods
(SpaRSA).

For Gaussian matrices, we sampled 10 iid random matrices independently to eliminate the exponentially small chance of getting a sample $\Phi$ with $\mu = \omega(1/\sqrt{N})$ or $\|\Phi\|^2 = \omega(C/N)$, and the median of the results among all 10 random matrices is reported. The use of 10 random trials to eliminate pathological sensing matrices is standard practice (see [17] for example).

The experiments relate accuracy of sparse recovery to the sparsity level and the Signal to Noise Ratio (SNR). Accuracy is measured in terms of the statistical 0-1 loss metric, which captures the fraction of signal support that is successfully recovered. The reconstruction algorithm outputs a $k$-sparse approximation $\hat{\alpha}$ to the $k$-sparse signal $\alpha$, and the statistical 0-1 loss is the fraction of the support of $\alpha$ that is not recovered in $\hat{\alpha}$. Each experiment was repeated 2000 times and Figure 4 records the average loss.
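A minimal sketch of the 0-1 loss computation is given below. The function name `zero_one_loss` and the choice to take the support of the estimate as its `k` largest-magnitude entries are assumptions made for illustration; the paper does not specify how the support of the approximation is extracted.

```python
def zero_one_loss(true_signal, estimate, k):
    """Fraction of the true support that is missing from the estimated support."""
    true_support = {i for i, v in enumerate(true_signal) if v != 0}
    # Assumed rule: the estimated support is the k largest-magnitude nonzero entries.
    order = sorted(range(len(estimate)), key=lambda i: abs(estimate[i]), reverse=True)
    est_support = {i for i in order[:k] if estimate[i] != 0}
    if not true_support:
        return 0.0
    return len(true_support - est_support) / len(true_support)
```

Averaging this quantity over repeated trials gives the average loss; one minus the loss is the fraction of the support that is recovered.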

Figure 4 plots statistical 0-1 loss and complexity (average reconstruction time) as a function of the sparsity level $k$. We select $k$-sparse signals with uniformly random support, with random signs, and with the amplitude of non-zero entries set equal to 1. Three different sensing matrices are compared: a Gaussian matrix, a DG(7, 0) frame and a DG(7, 1) sieve. After compressive sampling the signal support is recovered using the SpaRSA algorithm with $\lambda$ = 10~^. For random matrices the signal support is also recovered by $\ell_1$-minimization.
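The signal model described above (uniformly random support, random signs, unit amplitudes) can be sketched as follows. The function name `random_sparse_signal` and the use of Python's `random.Random` generator are illustrative assumptions.

```python
import random

def random_sparse_signal(C, k, rng):
    """Length-C signal with k nonzero entries of value +1 or -1 on a uniformly random support."""
    support = rng.sample(range(C), k)   # uniformly random support of size k
    alpha = [0] * C
    for i in support:
        alpha[i] = rng.choice((-1, 1))  # random sign, amplitude 1
    return alpha
```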

Figure 5a plots statistical 0-1 loss as a function of noise in the measurement domain and Figure 5b does the same for noise in the data domain. In the measurement noise study, an i.i.d. $\mathcal{N}(0, \sigma^2)$ measurement noise vector is added to the sensed vector to obtain the $N$-dimensional vector $f$. The original $k$-sparse signal $\alpha$ is then approximated by solving the LASSO program with $\lambda = 2\sqrt{2\sigma^2 \log C}$, and basis pursuit with $\epsilon = 2N\sigma^2$. Following Lemma 1, we use a similar method to study noise in the data domain. Figure 5 shows that DG frames and sieves outperform random Gaussian matrices in terms of noisy signal recovery using the LASSO.
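As a rough sketch of the measurement-noise setup, the snippet below computes the regularization parameter $\lambda = 2\sqrt{2\sigma^2\log C}$ and adds i.i.d. Gaussian noise to a sensed vector. The function names are hypothetical, and the measurements are taken to be real-valued for simplicity, which is an assumption; no LASSO solver is included.

```python
import math
import random

def lasso_lambda(C, sigma):
    """Regularization parameter 2 * sqrt(2 * sigma^2 * log C) for C columns and noise level sigma."""
    return 2.0 * math.sqrt(2.0 * math.log(C)) * sigma

def add_measurement_noise(f, sigma, rng):
    """Add i.i.d. N(0, sigma^2) noise to each entry of the sensed vector f."""
    return [v + rng.gauss(0.0, sigma) for v in f]
```

Because $\lambda$ scales linearly with $\sigma$, doubling the noise standard deviation doubles the regularization weight.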

V. Conclusion 

We have constructed two families of deterministic sensing matrices, DG(m, r) frames and DG(m, r) sieves, by exponentiating codewords from $\mathbb{Z}_4$-linear Delsarte-Goethals codes. We have verified that the worst-case coherence and the spectral norm of these sensing matrices satisfy the conditions necessary for uniqueness of sparse representation and fidelity of $\ell_1$ reconstruction via the LASSO algorithm. We have presented numerical results that confirm the performance predicted by the theory. These results show that DG frames and sieves outperform random Gaussian matrices in terms of noiseless and noisy signal recovery using the LASSO. Our focus here is on $\ell_1$ reconstruction using the LASSO algorithm, but we note that the particular structure of the DG matrices leads to faster algorithms and to additional features such as local decoding and stronger guarantees on resilience to noise in the data domain.

Appendix A
The Number of Solutions of Condition (CI)



Let $DG_0(m,r)$ denote the set of all zero-diagonal matrices in $DG(m,r)$:

$$DG_0(m,r) = \left\{ \sum_{t=1}^{r} P_t(a_t) \;\middle|\; a_t \in \mathbb{F}_{2^m},\ t = 1, \dots, r \right\}.$$

For every matrix $P$ in $DG_0(m,r)$, the vector $xPx^\top$ is a codeword of the linear binary code $DG_0(m,r)$, which is a sub-code of the Delsarte-Goethals code. Note that $DG_0(m,r)$ has $2^{rm}$ codewords of length $2^m$. The following lemma shows how the number of solutions to (CI) is related to the properties of this binary code.



Lemma 4. Let $\{W_0, \dots, W_N\}$ denote the weight distribution of $DG_0(m,r)$. Then the number of pairs $(x, x+e)$ satisfying (CI) is equal to

$$\frac{1}{2^{rm}} \sum_{i=0}^{N} W_i\, \mathcal{K}_2(i), \tag{11}$$

where $\mathcal{K}_j(z)$ is the Krawtchouk polynomial, defined as

$$\mathcal{K}_j(z) = \sum_{l=0}^{j} (-1)^l \binom{z}{l} \binom{N-z}{j-l}.$$

Proof: Lemma 3 implies that the number of pairs $(x, x+e)$ satisfying Condition (CI) is equal to the number of duplicate rows in $DG_0(m,r)$. The condition that the rows $x$ and $x+e$ are identical is equivalent to the condition that the vector with entry 1 in positions $x$ and $x+e$, and zero elsewhere, belongs to the dual code. The lemma now follows from the MacWilliams identities [23], which relate the number of codewords of weight 2 in the dual of $DG_0(m,r)$ to the weight distribution of $DG_0(m,r)$.

■ 
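The counting argument in the proof can be checked on a toy example. The sketch below is not the Delsarte-Goethals code itself: it uses a small, hypothetical binary linear code with two pairs of identical coordinates, computes the number of weight-2 dual codewords from the weight distribution via the MacWilliams identities with the standard Krawtchouk polynomial, and compares the result with a direct count of identical coordinate pairs. All function names are illustrative.

```python
from itertools import product
from math import comb

def krawtchouk(j, z, n):
    """Standard binary Krawtchouk polynomial K_j(z) for length n."""
    return sum((-1) ** l * comb(z, l) * comb(n - z, j - l) for l in range(j + 1))

def span(generators, n):
    """All codewords of the binary linear code generated by `generators`."""
    words = set()
    for coeffs in product((0, 1), repeat=len(generators)):
        w = [0] * n
        for c, g in zip(coeffs, generators):
            if c:
                w = [a ^ b for a, b in zip(w, g)]
        words.add(tuple(w))
    return words

def dual_weight_count(code, n, j):
    """Number of weight-j codewords in the dual code, via the MacWilliams identities."""
    W = [0] * (n + 1)
    for w in code:
        W[sum(w)] += 1
    total = sum(W[i] * krawtchouk(j, i, n) for i in range(n + 1))
    return total // len(code)

def identical_coordinate_pairs(code, n):
    """Direct count of coordinate pairs on which every codeword agrees."""
    return sum(1 for a in range(n) for b in range(a + 1, n)
               if all(w[a] == w[b] for w in code))

# Toy code: coordinates 0 and 1 are duplicates, as are coordinates 2 and 3.
toy = span([[1, 1, 0, 0], [0, 0, 1, 1]], 4)
```

For the toy code both `dual_weight_count(toy, 4, 2)` and `identical_coordinate_pairs(toy, 4)` count the same two duplicate pairs, which is the equivalence the proof relies on.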

Next we show that for the case $r = 1$, the number of solutions to (CI) depends only on the number of codewords with weight $2^{m-1}$ in $DG_0(m,1)$:

Theorem 7. Let $m$ be an odd number and let $r$ equal 1. Then the number of solutions to (CI) is $2^m - 1 - s$, where $s$ is the number of codewords with weight $2^{m-1}$ in $DG_0(m,1)$.

Proof: We start by calculating the rank of matrices in $DG_0(m,1)$. Let $a$ be a fixed nonzero element of $\mathbb{F}_{2^m}$. A field element $x$ is in the null space of $P_a$ if and only if for every field element $y$, $xP_ay^\top = 0$. Using Equation (3), this condition can be translated to the condition

$$\mathrm{Tr}\left((xy^2 + x^2y)a\right) = 0 \quad \text{for all } y.$$

Since $\mathrm{Tr}(x) = \mathrm{Tr}(x^2)$, the condition further reduces to

$$\mathrm{Tr}\left((xa + x^4a^2)y^2\right) = 0 \quad \text{for all } y.$$

Non-degeneracy of the trace implies that $xa + x^4a^2 = 0$. For $x \neq 0$ this is equivalent to $x^3 + \frac{1}{a} = 0$, which, since $m$ is odd, has the unique solution $x = a^{-1/3}$.
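The uniqueness of the cube root rests on the fact that $\gcd(3, 2^m - 1) = 1$ when $m$ is odd, so cubing permutes the field. The sketch below checks this by brute force for small fields. The helper names and the particular reduction polynomials ($x^3+x+1$, $x^4+x+1$, $x^5+x^2+1$) are choices made for illustration and are not taken from the paper.

```python
def gf_mul(a, b, m, poly):
    """Multiply a and b in GF(2^m); `poly` is the reduction polynomial including the x^m term."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:   # degree reached m: reduce modulo poly
            a ^= poly
    return result

def cube(x, m, poly):
    """Return x^3 in GF(2^m)."""
    return gf_mul(gf_mul(x, x, m, poly), x, m, poly)

def cube_map_is_bijection(m, poly):
    """True if x -> x^3 hits every element of GF(2^m) exactly once."""
    return len({cube(x, m, poly) for x in range(1 << m)}) == (1 << m)
```

For odd $m$ (for example `cube_map_is_bijection(3, 0b1011)`) the map is a bijection, whereas for $m = 4$ it is not, because 3 divides $2^4 - 1 = 15$.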

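Now let $S = \sum_{z \in \mathbb{F}_{2^m}} i^{zP_az^\top}$. Since $xP_ax^\top$ is a binary codeword, we have $S^2 = (N - 2w_a)^2$, where $w_a$ is the weight of the codeword determined by $P_a$. It has been proved in [11] that $S^2 = 2^m \sum_{e : eP_a = 0} i^{eP_ae^\top}$. We provide the proof here for completeness:

We have

$$S^2 = \sum_{x,y} i^{xP_ax^\top + yP_ay^\top} = \sum_{x,y} i^{(x \oplus y)P_a(x \oplus y)^\top + 2xP_ay^\top}.$$

Changing variables to $z = x \oplus y$ and $y$ gives

$$S^2 = \sum_{z} i^{zP_az^\top} \sum_{y} (-1)^{zP_ay^\top} = 2^m \sum_{z : zP_a = 0} i^{zP_az^\top}.$$

The null space of $P_a$ has only two elements, $0$ and $a^{-1/3}$. As a result

$$S^2 = 2^m\left(1 + i^{eP_ae^\top}\right), \qquad e = a^{-1/3}.$$

There are two cases: $S^2$ is either $0$ or $2^{m+1}$.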

Case 1: $S$ is zero. This case provides one possible weight value: $w_a = 2^{m-1}$.

Case 2: $|S|^2 = 2^{m+1}$. Therefore $2^m - 2w_a = \pm 2^{\frac{m+1}{2}}$. This case provides two distinct weight values: $w_a = 2^{m-1} \pm 2^{\frac{m-1}{2}}$.
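Once the possible weights are known, the Krawtchouk arithmetic used in the remainder of the proof can be checked numerically. The sketch below assumes the code has $2^m$ codewords (the case $r = 1$), uses the standard closed form $\mathcal{K}_2(z) = \binom{N}{2} - 2Nz + 2z^2$, and takes the multiplicities of the two outer weights to be $\frac{1}{2}\left(2^m - 1 - s \pm m2^{\frac{m-1}{2}}\right)$ as derived in the rest of the proof. The function names are hypothetical, and $s$ is treated as a free parameter, so the check concerns the algebra only.

```python
from fractions import Fraction
from math import comb

def krawtchouk2(z, n):
    """Closed form of the degree-2 binary Krawtchouk polynomial for length n."""
    return comb(n, 2) - 2 * n * z + 2 * z * z

def weight_two_dual_count(m, s):
    """Evaluate (1/2^m) * sum_i W_i K_2(i) for the four-weight distribution (1, t, s, t')."""
    n = 2 ** m
    h = 2 ** ((m - 1) // 2)                  # 2^((m-1)/2), exact for odd m
    t = Fraction(n - 1 - s + m * h, 2)       # multiplicity of weight n/2 - h
    t_prime = Fraction(n - 1 - s - m * h, 2) # multiplicity of weight n/2 + h
    total = (krawtchouk2(0, n)
             + t * krawtchouk2(n // 2 - h, n)
             + s * krawtchouk2(n // 2, n)
             + t_prime * krawtchouk2(n // 2 + h, n))
    return total / n
```

For every odd $m$ and every $s$ the value returned equals $2^m - 1 - s$, in agreement with the statement of Theorem 7.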

Hence $DG_0(m,1)$ has exactly four distinct weights $\left(0,\ 2^{m-1} - 2^{\frac{m-1}{2}},\ 2^{m-1},\ 2^{m-1} + 2^{\frac{m-1}{2}}\right)$. Let $(1, t, s, t')$ denote the corresponding weight distribution. We can use the MacWilliams identities to find the values of $t$ and $t'$ as a function of $s$. First, note that the dual code has exactly one codeword of weight 0. Using the MacWilliams identities with Krawtchouk polynomial $\mathcal{K}_0(z) = 1$ gives the equation $1 + t + s + t' = C$. Second, since all matrices in $DG_0(m,r)$ are zero-diagonal, for every field element $a$ and for every index $j$ in $\{0, \dots, m\}$, $e_jP_ae_j^\top = 0$, so the dual code has exactly $m + 1$ codewords of weight 1. Again, the MacWilliams identities, with Krawtchouk polynomial $\mathcal{K}_1(z) = N - 2z$, give the equation $(m+1)N = N + \sqrt{2N}\,(t - t')$. This equation can be simplified to $t - t' = m2^{\frac{m-1}{2}}$. Solving for $t$ and $t'$ in terms of $s$ gives $t = \frac{2^m - 1 - s + m2^{\frac{m-1}{2}}}{2}$ and $t' = \frac{2^m - 1 - s - m2^{\frac{m-1}{2}}}{2}$. The theorem then follows from substituting the values $t, s, t'$ into the expression in Lemma 4 and simplifying the expression using the Krawtchouk polynomial